Add sm_90a to enable use of accelerated wgmma and setmaxnreg instructions #913

Open: wants to merge 1 commit into base branch main

ConsceIeratus

sm_90a adds support for accelerated wgmma and setmaxnreg instructions.

https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#ptx-module-directives-target
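For illustration, here is a minimal sketch of how an sm_90a target could be expressed as nvcc `-gencode` flags. The helper name `gencode_flags` and the capability list are hypothetical and are not taken from this project's build files; only the `arch=compute_XX,code=sm_XX` flag format comes from nvcc.

```python
def gencode_flags(capabilities):
    """Return nvcc -gencode flags, one flag pair per compute capability string."""
    flags = []
    for cc in capabilities:
        # "90a" is the architecture-specific Hopper target; "90" is the baseline one.
        flags += ["-gencode", f"arch=compute_{cc},code=sm_{cc}"]
    return flags


# Hypothetical capability list, chosen only for this example.
print(" ".join(gencode_flags(["80", "90", "90a"])))
```

Each extra entry in the list adds another compiled code variant to the binary, which is the size trade-off raised in the replies.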

@TimDettmers
Collaborator

Thank you, it is good to know that wgmma has been added. I think Hopper supports both sm_90 and sm_90a. Since we do not use wgmma or setmaxnreg for now, we do not need sm_90a. I would prefer not to add it at the moment, to keep the binary a bit smaller. I am currently having trouble with the binary size, since all binaries must be smaller than 100 MB for PyPI uploads.
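As a rough sketch of the size constraint mentioned above, the following checks which files in a build output directory are not smaller than a limit. The function name `files_over_limit`, the directory argument, and the reading of "100 MB" as 100,000,000 bytes are all assumptions made for this example, not details from the project or from PyPI.

```python
from pathlib import Path

# Assumption: "100 MB" is read as 100 * 1000 * 1000 bytes.
LIMIT_BYTES = 100 * 1000 * 1000


def files_over_limit(directory, limit=LIMIT_BYTES):
    """Return sorted (name, size) pairs for files that are not smaller than `limit`."""
    results = []
    for path in Path(directory).iterdir():
        if path.is_file():
            size = path.stat().st_size
            if size >= limit:
                results.append((path.name, size))
    return sorted(results)
```

Calling `files_over_limit("dist")` on a hypothetical `dist` directory would list any artifacts that need to shrink before upload.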

TimDettmers added the labels "low priority (will be worked on after all priority issues)" and "Low Risk (Risk of bugs in transformers and other libraries)" on Jan 2, 2024
@Titus-von-Koeller
Collaborator

Thanks for raising this. We'll keep this in mind and implement it once we have figured out the current cross-platform + build + distribution topics.

@akx
Contributor

akx commented Jan 30, 2024

For PyPI, you can request a quota increase for the project via https://github.com/pypi/support :)

@younesbelkada
Collaborator

Nice - @akx would you mind sharing that in #990 as well 🙏

@Titus-von-Koeller
Collaborator

> For PyPI, you can request a quota increase for the project via https://github.com/pypi/support :)

Nice, that would unblock us on this before we've figured out the cross-platform compilation and distribution stuff.

I'll look into that today. Valuable input, thanks!

Labels
build, low priority (will be worked on after all priority issues), Low Risk (Risk of bugs in transformers and other libraries)